Classifying RNA Secondary Structures Using Support Vector Machines
Authors
Abstract
In contrast to DNA, RNA exists predominantly as a single strand. Because it contains small self-complementary regions, RNA commonly exhibits an intricate secondary structure consisting of relatively short double-helical segments alternating with single-stranded regions. The amount of available sequence data is rising rapidly. One problem encountered when working on a specific molecule is finding the relevant data among the massive number of other sequences, which otherwise has to be done by reading lists of short descriptions of all new entries in large existing databases. One of the main objectives of this work is to take the extracted structures of aligned ribosomal RNA sequences and their secondary structures and cluster them. The proposal is to apply existing dimensionality reduction algorithms to these extracted structures and then cluster them in the reduced-dimensional space using Support Vector Machines.
Thesis by PrathyUsha Sunkara, submitted to the Faculty of New Jersey Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Computer Science, Department of Computer Science.
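The pipeline outlined in the abstract (reduce the dimensionality of structure descriptors, then apply a Support Vector Machine in the reduced space) can be illustrated with a minimal sketch. The abstract does not say which features, which reduction algorithm, or which SVM settings the thesis uses, so everything here is an assumption: the dot-bracket input format, the four hand-picked features, PCA as the reduction step, a linear SVM trained by subgradient descent, and all function names and toy structures are hypothetical. Because SVMs are supervised, the sketch also assumes class labels are available.

```python
import numpy as np

def dot_bracket_features(structure):
    """Hypothetical descriptors of a dot-bracket string (not the thesis's features)."""
    depth = max_depth = runs = 0
    prev = "."
    for ch in structure:
        if ch == "(":
            depth += 1
            max_depth = max(max_depth, depth)
            if prev != "(":          # a new run of opening brackets starts here
                runs += 1
        elif ch == ")":
            depth -= 1
        prev = ch
    paired = (structure.count("(") + structure.count(")")) / len(structure)
    return [paired, float(max_depth), float(runs), float(len(structure))]

def pca_project(X, k):
    """Standardize the columns of X and project onto the top-k principal axes."""
    Z = X - X.mean(axis=0)
    scale = Z.std(axis=0)
    scale[scale == 0] = 1.0          # avoid dividing by zero for constant columns
    Z = Z / scale
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ vt[:k].T

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Full-batch subgradient descent on the L2-regularized hinge loss."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        viol = y * (X @ w + b) < 1   # samples inside or beyond the margin
        grad_w = lam * w - (y[viol][:, None] * X[viol]).sum(axis=0) / len(y)
        grad_b = -y[viol].sum() / len(y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data (assumed): single hairpins labeled +1, multi-stem structures labeled -1.
structures = [
    "((((....))))", "(((...)))", "((((((....))))))",
    "((..))..((..))", "((..)).((..)).((..))", "(..)(..)(..)(..)",
]
labels = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

X = np.array([dot_bracket_features(s) for s in structures])
Z = pca_project(X, 2)                       # reduced-dimensional representation
w, b = train_linear_svm(Z, labels)
pred = np.where(Z @ w + b >= 0, 1, -1)      # predicted class per structure
print(pred)
```

In practice a library SVM with a nonlinear kernel and a held-out test set would be the more usual choice; the hand-rolled linear version is used here only to keep the sketch dependency-light.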
Similar resources
Classifying RNA-Binding Proteins Based on Electrostatic Properties
Protein structure can provide new insight into the biological function of a protein and can enable the design of better experiments to learn its biological roles. Moreover, deciphering the interactions of a protein with other molecules can contribute to the understanding of the protein's function within cellular processes. In this study, we apply a machine learning approach for classifying RNA-...
Marginalized kernels for RNA sequence data analysis.
We present novel kernels that measure similarity of two RNA sequences, taking account of their secondary structures. Two types of kernels are presented. One is for RNA sequences with known secondary structures, the other for those without known secondary structures. The latter employs stochastic context-free grammar (SCFG) for estimating the secondary structure. We call the latter the marginali...
Comparison of classic regression methods with neural network and support vector machine in classifying groundwater resources
In the present era, classification of data is one of the most important issues in various sciences in order to detect and predict events. In statistics, the traditional view of these classifications will be based on classic methods and statistical models such as logistic regression. In the present era, known as the era of explosion of information, in most cases, we are faced with data that c...
High performance of the support vector machine in classifying hyperspectral data using a limited dataset
To prospect mineral deposits at regional scale, recognition and classification of hydrothermal alteration zones using remote sensing data is a popular strategy. Due to the large number of spectral bands, classification of the hyperspectral data may be negatively affected by the Hughes phenomenon. A practical way to handle the Hughes problem is preparing a lot of training samples until the size ...
Determine ncRNA structure shape using context-free grammar and support vector machine
Non-coding RNA molecules perform their cellular roles through their primary sequences as well as secondary structures. A model of secondary structure is required to determine mechanism of actions of ncRNA sequences. Secondary structure prediction is mainly determining all the possible base pairs of a given ncRNA sequence. In this work, we first make use of a structure prediction tool, Mfold to ...